
Empirical analysis and evaluation of approximate techniques for pruning regression bagging ensembles


Abstract

Identifying the optimal subset of regressors in a regression bagging ensemble is a difficult task that has exponential cost in the size of the ensemble. In this article we analyze two approximate techniques especially devised to address this problem. The first strategy constructs a relaxed version of the problem that can be solved using semidefinite programming. The second one is based on modifying the order of aggregation of the regressors. Ordered aggregation is a simple forward selection algorithm that incorporates at each step the regressor that reduces the training error of the current subensemble the most. Both techniques can be used to identify subensembles that are close to the optimal ones, which can be obtained by exhaustive search at a larger computational cost. Experiments in a wide variety of synthetic and real-world regression problems show that pruned ensembles composed of only 20% of the initial regressors often have better generalization performance than the original bagging ensembles. These improvements are due to a reduction in the bias and the covariance components of the generalization error. Subensembles obtained using either SDP or ordered aggregation generally outperform subensembles obtained by other ensemble pruning methods and ensembles generated by the Adaboost.R2 algorithm, negative correlation learning or regularized linear stacked generalization. Ordered aggregation has a slightly better overall performance than SDP in the problems investigated. However, the difference is not statistically significant. Ordered aggregation has the further advantage that it produces a nested sequence of near-optimal subensembles of increasing size with no additional computational cost. © 2011 Elsevier B.V.
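The ordered-aggregation strategy described above can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: it assumes each base regressor's training-set predictions are precomputed, that the subensemble prediction is the plain average of its members, and that training error is measured as mean squared error. The function name and array layout are hypothetical.

```python
import numpy as np

def ordered_aggregation(preds, y, k):
    """Greedy forward selection of ensemble members.

    preds: (n_models, n_samples) array of each base regressor's
           predictions on the training set.
    y:     (n_samples,) array of training targets.
    k:     number of members to select.
    Returns the indices of the selected regressors, in the order
    in which they were incorporated.
    """
    n_models = preds.shape[0]
    selected = []
    running_sum = np.zeros_like(y, dtype=float)
    for _ in range(k):
        best_i, best_err = None, np.inf
        for i in range(n_models):
            if i in selected:
                continue
            # Training MSE of the subensemble if regressor i is added
            # (subensemble prediction = average of its members).
            candidate = (running_sum + preds[i]) / (len(selected) + 1)
            err = np.mean((candidate - y) ** 2)
            if err < best_err:
                best_i, best_err = i, err
        selected.append(best_i)
        running_sum += preds[best_i]
    return selected
```

Because each step only extends the previous selection, running the loop up to the full ensemble size yields the nested sequence of subensembles mentioned in the abstract at no extra cost: the first k indices returned are always the size-k subensemble.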
